Avian leukosis viruses and polypeptide display

ABSTRACT

The invention provides methods and materials involved in displaying polypeptide sequences using viruses such as avian leukosis viruses. Specifically, the invention provides nucleic acid molecules, collections of nucleic acid molecules, polypeptides, collections of polypeptides, viruses, and collections of viruses as well as methods for making nucleic acid molecules, collections of nucleic acid molecules, polypeptides, collections of polypeptides, viruses, and collections of viruses. The invention also provides methods for obtaining displayed polypeptide sequences that interact with biological molecules and/or cells as well as methods for identifying biological molecules that interact with displayed polypeptides.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a divisional application of U.S. application Ser. No. 10/098,935, filed Mar. 13, 2002.

BACKGROUND

1. Technical Field

The invention relates to methods and materials involved in displaying polypeptide sequences using viruses such as avian leukosis viruses.

2. Background Information

Display technology involves generating libraries of modularly coded biomolecules and screening those biomolecules for particular properties. One feature of display technology is to link a particular phenotype (e.g., a displayed polypeptide) to its genotype (e.g., a nucleic acid encoding the displayed polypeptide) so that the genotypes of selected phenotypes can be rapidly identified. Polypeptide display systems include viral display systems as well as cell-based display systems. Viral and cell-based display systems have the ability to amplify the selected population of displayed polypeptides.

Phage display has been used extensively as a platform for polypeptide display, accommodating a wide range of polypeptides from small polypeptides to single chain antibodies. For example, phage display libraries have been used to select polypeptides that specifically bind to unique antigens on immobilized polypeptides and to targeted receptors on cultured cells (Li, M., Nat. Biotech., 18:1251-1256 (2000)). In addition, in vivo selection strategies of phage display polypeptide libraries in mice have been developed (Pasqualini and Ruoslahti, Nature, 380:364-366 (1996)). These selection strategies allow cells, organs, and tumors to be studied in their natural environments, a complexity that is difficult to model in vitro. Thus, the power of polypeptide display technology for identifying new therapeutic targets such as targets for cancer treatment both in vitro and in vivo is clear.

SUMMARY

The invention provides methods and materials involved in displaying polypeptide sequences using viruses such as avian leukosis viruses (ALV). Specifically, the invention provides nucleic acid molecules, collections of nucleic acid molecules, polypeptides, collections of polypeptides, viruses, and collections of viruses. The invention also provides methods for making nucleic acid molecules, collections of nucleic acid molecules, polypeptides, collections of polypeptides, viruses, and collections of viruses.

The nucleic acid molecules and collections of nucleic acid molecules provided herein can encode ALV surface glycoproteins having N-terminal polypeptide extensions. Such nucleic acid molecules and collections of nucleic acid molecules can be used to produce ALV surface glycoproteins having N-terminal polypeptide extensions as well as viruses containing (1) ALV surface glycoproteins having N-terminal polypeptide extensions and/or (2) nucleic acid molecules encoding ALV surface glycoproteins having N-terminal polypeptide extensions. As described herein, viruses (e.g., ALV) containing ALV surface glycoproteins having N-terminal polypeptide extensions can be used as a polypeptide display platform, providing researchers with a powerful tool for, inter alia, identifying new therapeutic targets such as targets for cancer treatment.

In addition, the invention provides methods for obtaining displayed polypeptide sequences that interact with biological molecules (e.g., cell receptors and cell glycoproteins) and/or cells (e.g., cancer cells). For example, the methods and materials provided herein can be used to obtain displayed polypeptides that bind cell surface receptors, that mimic the properties of other polypeptides, or that bind specific cells or tissue surfaces. Likewise, the methods and materials provided herein can be used to identify optimal binding substrates and to elucidate polypeptide interactions such as polypeptide-polypeptide interactions and polypeptide-carbohydrate interactions. Such methods can help researchers develop new reagents to treat conditions such as cancer, autoimmunity, infections (e.g., viral infections, bacterial infections, and fungal infections), and central nervous system disorders (e.g., Parkinson's disease, Huntington's disease, and Alzheimer's disease).

The invention provides methods for identifying biological molecules (e.g., cell receptors and cell glycoproteins) that interact with displayed polypeptides. Identifying biological molecules such as cell receptors primarily expressed by tumor cells can help researchers develop new reagents that specifically target those identified biological molecules. For example, identifying a cell surface receptor that is only expressed by breast tumor cells can help researchers develop drugs that target and destroy only breast tumor cells.

The invention is based on the discovery that ALV surface glycoproteins having N-terminal polypeptide extensions of various lengths can be efficiently incorporated into infectious virions. The invention also is based on the discovery that viruses containing ALV surface glycoproteins having N-terminal polypeptide extensions of various lengths can replicate efficiently, reaching infectious titers comparable to wild-type viruses. In addition, the invention is based on the discovery that viruses containing ALV surface glycoproteins having N-terminal polypeptide extensions of various lengths can (1) stably retain the N-terminal polypeptide extensions after repeated virus repassage and (2) bind both specific immobilized ligands and cells expressing specific ligands.

In one aspect, the invention features a nucleic acid molecule containing a first nucleic acid sequence, where the first nucleic acid sequence encodes a first polypeptide containing an avian leukosis virus surface glycoprotein amino acid sequence and a first amino acid sequence, where the first amino acid sequence is heterologous to naturally occurring avian leukosis virus amino acid sequences, and where the first amino acid sequence is attached to the amino-terminal portion of the avian leukosis virus surface glycoprotein amino acid sequence. The first amino acid sequence can be between five and 500 amino acid residues in length, between ten and 250 amino acid residues in length, or between 15 and 100 amino acid residues in length. The first amino acid sequence can contain a sequence from a receptor, receptor ligand, immunoglobulin, enzyme, or enzyme substrate. The avian leukosis virus surface glycoprotein amino acid sequence can contain a sequence as set forth in SEQ ID NO: 1, 2, 3, 4, 5, or 6. The nucleic acid molecule can encode an avian leukosis virus transmembrane glycoprotein amino acid sequence. The first polypeptide can form a covalent attachment with an avian leukosis virus transmembrane glycoprotein when the first polypeptide is part of an avian leukosis virus. The nucleic acid molecule can contain a second nucleic acid sequence. The second nucleic acid sequence can be heterologous to naturally occurring avian leukosis virus sequences. The second nucleic acid sequence can encode a second polypeptide. The second polypeptide can be between five and 500 amino acid residues in length, between ten and 250 amino acid residues in length, or between 15 and 100 amino acid residues in length. The second polypeptide can be a receptor, receptor ligand, immunoglobulin, enzyme, or enzyme substrate. The nucleic acid molecule can contain a retroviral 5′-LTR sequence, a retroviral gag sequence, a retroviral pol sequence, and a retroviral 3′-LTR sequence. The second nucleic acid sequence can be located between the first nucleic acid sequence and the retroviral 3′-LTR sequence. The retroviral 5′-LTR sequence, the retroviral gag sequence, the retroviral pol sequence, and the retroviral 3′-LTR sequence can be avian leukosis virus sequences. The nucleic acid molecule can encode a replication-competent avian leukosis virus or a replication-defective avian leukosis virus.

In another embodiment, the invention features a plurality of nucleic acid molecules, where each nucleic acid molecule encodes a first polypeptide containing an avian leukosis virus surface glycoprotein amino acid sequence and a first amino acid sequence, where the first amino acid sequence is heterologous to naturally occurring avian leukosis virus amino acid sequences, and where the first amino acid sequence is attached to the amino-terminal portion of the avian leukosis virus surface glycoprotein amino acid sequence. The avian leukosis virus surface glycoprotein amino acid sequence of each first polypeptide can be identical. The first amino acid sequence of each first polypeptide can be different. Each of the plurality of nucleic acid molecules can encode an avian leukosis virus transmembrane glycoprotein amino acid sequence. Each first polypeptide can form a covalent attachment with an avian leukosis virus transmembrane glycoprotein when each first polypeptide is part of an avian leukosis virus. Each of the plurality of nucleic acid molecules can contain a second nucleic acid sequence that encodes a second polypeptide.

Another aspect of the invention features a polypeptide containing an avian leukosis virus surface glycoprotein amino acid sequence and a first amino acid sequence, where the first amino acid sequence is heterologous to naturally occurring avian leukosis virus amino acid sequences, and where the first amino acid sequence is attached to the amino-terminal portion of the avian leukosis virus surface glycoprotein amino acid sequence. The first amino acid sequence can be between five and 500 amino acid residues in length, between ten and 250 amino acid residues in length, or between 15 and 100 amino acid residues in length. The first amino acid sequence can contain a sequence from a receptor, receptor ligand, immunoglobulin, enzyme, or enzyme substrate. The avian leukosis virus surface glycoprotein amino acid sequence can contain a sequence as set forth in SEQ ID NO: 1, 2, 3, 4, 5, or 6. The polypeptide can form a covalent attachment with an avian leukosis virus transmembrane glycoprotein when the polypeptide is part of an avian leukosis virus.

In another embodiment, the invention features a plurality of polypeptides, where each polypeptide contains an avian leukosis virus surface glycoprotein amino acid sequence and a first amino acid sequence, where the first amino acid sequence of each polypeptide is heterologous to naturally occurring avian leukosis virus amino acid sequences, and where the first amino acid sequence of each polypeptide is attached to the amino-terminal portion of the avian leukosis virus surface glycoprotein amino acid sequence. The avian leukosis virus amino acid sequence of each polypeptide can be identical. The first amino acid sequence of each polypeptide can be different. Each polypeptide can form a covalent attachment with an avian leukosis virus transmembrane glycoprotein when part of an avian leukosis virus.

Another aspect of the invention features a virus containing a nucleic acid molecule containing a first nucleic acid sequence, where the first nucleic acid sequence encodes a first polypeptide containing an avian leukosis virus surface glycoprotein amino acid sequence and a first amino acid sequence, where the first amino acid sequence is heterologous to naturally occurring avian leukosis virus amino acid sequences, and where the first amino acid sequence is attached to the amino-terminal portion of the avian leukosis virus surface glycoprotein amino acid sequence. The virus can be a retrovirus (e.g., an avian leukosis virus or a murine leukemia virus). The virus can contain the first polypeptide. The nucleic acid molecule can encode an avian leukosis virus transmembrane glycoprotein amino acid sequence. The first polypeptide can form a covalent attachment with an avian leukosis virus transmembrane glycoprotein when the first polypeptide is part of an avian leukosis virus. The virus can contain an avian leukosis virus transmembrane glycoprotein, and the first polypeptide can form a covalent attachment with the avian leukosis virus transmembrane glycoprotein. The nucleic acid molecule can contain a second nucleic acid sequence, the second nucleic acid sequence being heterologous to naturally occurring avian leukosis viruses. The second nucleic acid sequence can encode a second polypeptide. The virus can contain the second polypeptide. The second polypeptide can be a receptor, receptor ligand, immunoglobulin, enzyme, or enzyme substrate. The second nucleic acid sequence can be located between an env viral sequence and a 3′ LTR viral sequence. The virus can be replication-competent or replication-defective.

In another embodiment, the invention features a virus containing a first polypeptide, where the first polypeptide contains an avian leukosis virus surface glycoprotein amino acid sequence and a first amino acid sequence, where the first amino acid sequence is heterologous to naturally occurring avian leukosis virus amino acid sequences, and where the first amino acid sequence is attached to the amino-terminal portion of the avian leukosis virus surface glycoprotein amino acid sequence. The virus can be a retrovirus (e.g., an avian leukosis virus or a murine leukemia virus). The first polypeptide can form a covalent attachment with an avian leukosis virus transmembrane glycoprotein when the first polypeptide is part of an avian leukosis virus. The virus can contain an avian leukosis virus transmembrane glycoprotein, and the first polypeptide can form a covalent attachment with the avian leukosis virus transmembrane glycoprotein. The virus can contain a nucleic acid molecule containing a first nucleic acid sequence, where the first nucleic acid sequence encodes the first polypeptide. The nucleic acid molecule can contain a second nucleic acid sequence, where the second nucleic acid sequence is heterologous to naturally occurring avian leukosis viruses. The second nucleic acid sequence can encode a second polypeptide. The second polypeptide can be a receptor, receptor ligand, immunoglobulin, enzyme, or enzyme substrate. The second nucleic acid sequence can be located between the first nucleic acid sequence and a 3′ LTR viral sequence. The virus can be replication-competent or replication-defective.

Another embodiment of the invention features a plurality of viruses, where each virus contains a nucleic acid molecule containing a first nucleic acid sequence, where the first nucleic acid sequence encodes a first polypeptide, where each first polypeptide contains an avian leukosis virus surface glycoprotein amino acid sequence and a first amino acid sequence, where the first amino acid sequence is heterologous to naturally occurring avian leukosis virus amino acid sequences, and where the first amino acid sequence is attached to the amino-terminal portion of the avian leukosis virus surface glycoprotein amino acid sequence. The avian leukosis virus surface glycoprotein amino acid sequence of each first polypeptide can be identical. The first amino acid sequence of each first polypeptide can be different. Each virus can contain the first polypeptide. The nucleic acid molecule of each virus can contain a second nucleic acid sequence. The second nucleic acid sequence of each virus can be different. The second nucleic acid sequence can encode a second polypeptide. Each virus can contain the second polypeptide. Each virus can be replication-competent or replication-defective. The plurality can be at least 500.

Another embodiment of the invention features a plurality of viruses, where each virus contains a first polypeptide, where each first polypeptide contains an avian leukosis virus surface glycoprotein amino acid sequence and a first amino acid sequence, where the first amino acid sequence is heterologous to naturally occurring avian leukosis virus amino acid sequences, and where the first amino acid sequence is attached to the amino-terminal portion of the avian leukosis virus surface glycoprotein amino acid sequence. The avian leukosis virus surface glycoprotein amino acid sequence of each first polypeptide can be identical. The first amino acid sequence of each first polypeptide can be different. Each virus can contain a nucleic acid molecule containing a first nucleic acid sequence, where the first nucleic acid sequence encodes the first polypeptide. The nucleic acid molecule of each virus can contain a second nucleic acid sequence. The second nucleic acid sequence of each virus can be different. The second nucleic acid sequence can encode a second polypeptide. Each virus can contain the second polypeptide. Each virus can be replication-competent or replication-defective. The plurality can be at least 500.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.

Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

DESCRIPTION OF DRAWINGS

FIG. 1 is a schematic representation of the ALV(A) retroviral vector, the wild-type construct 1, and the chimeric envelope glycoprotein constructs 2-5. The ALV-based retroviral vector contains the gag, pol, and env viral sequences and nucleic acid encoding an alkaline phosphatase polypeptide flanked by long terminal repeats (LTR). The envelope glycoproteins are translated from a spliced mRNA and contain a signal peptide (including six amino acids from the start of Gag) followed by a protease cleavage site at the start of the mature surface glycoprotein (+1). All chimeric envelope glycoproteins contained additional epitopes inserted in frame at the amino-terminus of the env sequence (+1). The bolded and underlined FLAG represents the eight amino acid FLAG® epitope; the bolded and underlined EGF represents a 53-amino acid EGF ligand; and the G4S represents four glycine residues followed by a serine residue. The AAQPA (SEQ ID NO:8), IEGR (SEQ ID NO:9), and AAA sequences represent the amino acid sequences of an Sfi I site, a Factor Xa cleavage site, and a Not I site, respectively. The SD represents a splice donor, while the SA represents a splice acceptor.

FIG. 2 is a graph plotting virus growth (OD₄₉₀) versus days post transfection for viruses produced from cells either mock transfected or transfected with the indicated construct.

FIG. 3 contains photographs from Western immunoblots performed using the indicated antibodies. In each case, lane 1 contained a sample made from a mock transfection; lane 2 contained a sample made using WT ALV(A) (construct 1); lane 3 contained a sample made using WT+FLAG (construct 2); lane 4 contained a sample made using WT+1EGF (construct 3); lane 5 contained a sample made using WT+2EGF (construct 4); and lane 6 contained a sample made using WT+3EGF (construct 5).

FIG. 4 contains a photograph from a Western immunoblot performed using the indicated antibody and samples treated with (+) or without (−) Factor Xa. Lanes 1 and 2 contained a sample made from a mock transfection; lanes 3 and 4 contained a sample made using WT ALV(A) (construct 1); lanes 5 and 6 contained a sample made using WT+FLAG (construct 2); lanes 7 and 8 contained a sample made using WT+1EGF (construct 3); lanes 9 and 10 contained a sample made using WT+2EGF (construct 4); and lanes 11 and 12 contained a sample made using WT+3EGF (construct 5).

FIG. 5 contains graphs plotting virus growth (OD₄₉₀) versus days post infection for first and second re-passages of viruses produced from cells either mock transfected or transfected with the indicated construct.

FIG. 6 contains photographs from Western immunoblots performed using the indicated antibodies and samples obtained from either first or second re-passages. In each case, lane 1 contained a sample made from a mock transfection; lane 2 contained a first or second re-passage sample made using WT ALV(A) (construct 1); lane 3 contained a first or second re-passage sample made using WT+FLAG (construct 2); lane 4 contained a first or second re-passage sample made using WT+1EGF (construct 3); lane 5 contained a first or second re-passage sample made using WT+2EGF (construct 4); and lane 6 contained a first or second re-passage sample made using WT+3EGF (construct 5).

FIG. 7 is eight FACS graphs plotting cell counts versus fluorescence (FL2-Height) for A431 cells incubated with viruses made using the indicated constructs either in the presence or absence of 1 μM recombinant EGF.

FIG. 8 is a schematic representation of the steps that can be used to make an ALV polypeptide display library. The SD represents a splice donor, while the SA represents a splice acceptor.

FIG. 9 is a schematic representation of the ALV(A) retroviral vector of an ALV library designed to contain linear 10-mer polypeptides, X₁₀, randomized at all positions. The AAQPA (SEQ ID NO:8) and AAA sequences represent the amino acid sequences of an Sfi I site and a Not I site, respectively. The G4S represents four glycine residues followed by a serine residue. The SD represents a splice donor, while the SA represents a splice acceptor.

FIG. 10 is a sequence alignment of five ALV surface glycoprotein amino acid sequences. The first sequence designated T-RCASBP(A)SU represents SEQ ID NO:1; the second sequence designated T.RAV-2 env.1 represents SEQ ID NO:2; the third sequence designated T.PrRSV(C)SU represents SEQ ID NO:3; the fourth sequence designated T.SR-D env.1 represents SEQ ID NO:4; and the fifth sequence designated T.RAV-O env represents SEQ ID NO:5. The sixth sequence listed under the first five sequences represents a consensus sequence with each blank space or dot (.) being any one of the amino acid residues aligned directly above that particular space or dot. For example, the space at position 238 of the consensus sequence can be a lysine, threonine, or isoleucine. This consensus sequence represents SEQ ID NO:6.

DETAILED DESCRIPTION

The invention provides methods and materials related to the display of polypeptide sequences using viruses such as ALV. Specifically, the invention provides nucleic acid molecules, collections of nucleic acid molecules, polypeptides, collections of polypeptides, viruses, and collections of viruses as well as methods for making nucleic acid molecules, collections of nucleic acid molecules, polypeptides, collections of polypeptides, viruses, and collections of viruses. The invention also provides methods for obtaining displayed polypeptide sequences that interact with biological molecules (e.g., cell receptors and cell glycoproteins) and/or cells (e.g., cancer cells) as well as methods for identifying biological molecules (e.g., cell receptors and cell glycoproteins) that interact with displayed polypeptides.

1. Nucleic Acid

The term “nucleic acid” as used herein encompasses both RNA and DNA, including cDNA, genomic DNA, and synthetic (e.g., chemically synthesized) DNA. The nucleic acid can be double-stranded or single-stranded. Where single-stranded, the nucleic acid can be the sense strand or the antisense strand. In addition, nucleic acid can be circular or linear.

The invention provides nucleic acid molecules that encode polypeptides having (1) an ALV surface glycoprotein amino acid sequence and (2) an amino acid sequence heterologous to any naturally occurring ALV amino acid sequence. Typically, the heterologous amino acid sequence is attached to the amino-terminal portion of the ALV surface glycoprotein amino acid sequence. For example, the nucleic acid molecules of the invention can encode polypeptides where each polypeptide has a different amino acid sequence (e.g., a different non-ALV sequence) attached to the amino-terminal portion of an ALV surface glycoprotein amino acid sequence. The term “ALV surface glycoprotein amino acid sequence” as used herein refers to any amino acid sequence that is at least 65 percent (e.g., at least 70, 75, 80, 85, 90, 95, 99, or 100 percent) identical to an ALV surface glycoprotein amino acid sequence as found in nature. In addition, an ALV surface glycoprotein amino acid sequence can form a covalent attachment with an ALV transmembrane glycoprotein when they are expressed by a cell or incorporated into a virus. Such ALV surface glycoprotein amino acid sequences include, without limitation, the amino acid sequences set forth in FIG. 10.

The percent identity between a particular amino acid sequence and an ALV surface glycoprotein amino acid sequence found in nature is determined as follows. First, the amino acid sequences are aligned using the BLAST 2 Sequences (Bl2seq) program from the stand-alone version of BLASTZ containing BLASTP version 2.0.14. This stand-alone version of BLASTZ can be obtained from Fish & Richardson's web site (e.g., “www” dot “fr” dot “com” slash “blast” slash) or the U.S. government's National Center for Biotechnology Information web site (“www” dot “ncbi” dot “nlm” dot “nih” dot “gov”). Instructions explaining how to use the Bl2seq program can be found in the readme file accompanying BLASTZ. Bl2seq performs a comparison between two amino acid sequences using the BLASTP algorithm. To compare two amino acid sequences, the options of Bl2seq are set as follows: -i is set to a file containing the first amino acid sequence to be compared (e.g., C:\seq1.txt); -j is set to a file containing the second amino acid sequence to be compared (e.g., C:\seq2.txt); -p is set to blastp; -o is set to any desired file name (e.g., C:\output.txt); and all other options are left at their default setting. For example, the following command can be used to generate an output file containing a comparison between two amino acid sequences: C:\Bl2seq -i c:\seq1.txt -j c:\seq2.txt -p blastp -o c:\output.txt. If the two compared sequences share homology, then the designated output file will present those regions of homology as aligned sequences. If the two compared sequences do not share homology, then the designated output file will not present aligned sequences.
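By way of illustration only, the command line described above can be assembled programmatically. The following Python sketch builds the argument list from the options named in the text (-i, -j, -p, and -o) and leaves all other options at their defaults. The function name build_bl2seq_command and the default executable path are hypothetical, and the sketch only constructs the command; it does not run Bl2seq.

```python
from typing import List


def build_bl2seq_command(
    first_seq_file: str,
    second_seq_file: str,
    output_file: str,
    executable: str = r"C:\Bl2seq",  # assumed path, copied from the example command
) -> List[str]:
    """Return the Bl2seq argument list for a BLASTP comparison of two files.

    Only the options named in the text are set; everything else is left
    at the program defaults.
    """
    return [
        executable,
        "-i", first_seq_file,   # file holding the first amino acid sequence
        "-j", second_seq_file,  # file holding the second amino acid sequence
        "-p", "blastp",         # compare using the BLASTP algorithm
        "-o", output_file,      # file that receives the comparison
    ]


if __name__ == "__main__":
    cmd = build_bl2seq_command(r"c:\seq1.txt", r"c:\seq2.txt", r"c:\output.txt")
    print(" ".join(cmd))
```

The list form can be handed to a process launcher without shell quoting concerns, which is the only design choice made here.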

Once aligned, the number of matches is determined by counting the number of positions where an identical amino acid residue is presented in both sequences. The percent identity is determined by dividing the number of matches by the length of the full-length ALV surface glycoprotein amino acid sequence followed by multiplying the resulting value by 100. For example, an amino acid sequence that has 273 matches when aligned with the sequence set forth in SEQ ID NO:1 is 80.1 percent identical to the sequence set forth in SEQ ID NO:1 (i.e., 273÷341*100=80.1).
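As a minimal sketch of this calculation, the following Python functions count matches between two aligned sequences and divide by the full length of the reference sequence. The representation of an alignment as two equal-length strings with "-" marking gaps is an assumption made for illustration, as are the function names; the 273 and 341 values are those of the example above.

```python
def count_matches(aligned_query: str, aligned_reference: str, gap: str = "-") -> int:
    """Count aligned positions where both sequences show the same residue.

    Assumes the two strings are rows of one alignment (equal length),
    with `gap` marking alignment gaps; gap positions never count as matches.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    return sum(
        1
        for q, r in zip(aligned_query, aligned_reference)
        if q == r and q != gap
    )


def percent_identity(matches: int, reference_length: int) -> float:
    """Matches divided by the full-length reference sequence, times 100."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    return matches / reference_length * 100


if __name__ == "__main__":
    # Example from the text: 273 matches against a 341-residue sequence.
    print(round(percent_identity(273, 341), 1))
```

Note that the denominator is the full-length reference sequence, not the length of the aligned region, so a short but perfect local alignment still yields a low percent identity.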

It is noted that the percent identity value is rounded to the nearest tenth. For example, 78.11, 78.12, 78.13, and 78.14 are rounded down to 78.1, while 78.15, 78.16, 78.17, 78.18, and 78.19 are rounded up to 78.2. It also is noted that the length value will always be an integer.
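The rounding convention can be sketched as follows. Because binary floating-point values such as 78.15 are not stored exactly, this illustration uses Python's decimal module with half-up rounding; that choice, and the function name round_to_tenth, are assumptions of the sketch rather than anything prescribed above.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to_tenth(value) -> Decimal:
    """Round a percent identity value to the nearest tenth, halves rounding up.

    The value is converted through str() so that an input written as 78.15
    is treated as exactly 78.15.
    """
    return Decimal(str(value)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    for raw in ("78.11", "78.14", "78.15", "78.19"):
        print(raw, "->", round_to_tenth(raw))
```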

Again, the nucleic acid molecules provided herein encode polypeptides having a heterologous amino acid sequence attached to the amino-terminal portion of an ALV surface glycoprotein amino acid sequence. The amino-terminal portion of an amino acid sequence refers to any part of that amino acid sequence that is within at least the first 25 amino-terminal amino acid residues (e.g., within at least the first 20, 15, 10, 5, or fewer amino-terminal amino acid residues) of that amino acid sequence. For example, a polypeptide having a 100-amino acid non-viral sequence inserted between the fifth and sixth amino acid residues of the amino acid sequence set forth in SEQ ID NO:1 is a polypeptide having an ALV surface glycoprotein amino acid sequence with a heterologous amino-terminal extension. It is noted that the heterologous amino acid sequences described herein can be attached to an ALV surface glycoprotein amino acid sequence via a region other than an amino-terminal portion. For example, a heterologous amino acid sequence can be attached to the first, second, third, or fourth 50 amino acid segment of an ALV surface glycoprotein amino acid sequence.
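For illustration, the positional definition above can be expressed as a simple check. In the following Python sketch, the insertion point is given as the number of surface glycoprotein residues that precede the heterologous sequence, so the example of an insertion between the fifth and sixth residues corresponds to a value of 5. The function names are hypothetical, and treating an insertion after exactly the 25th residue as still within the window is an assumption of the sketch.

```python
def is_amino_terminal_attachment(residues_before_insert: int, window: int = 25) -> bool:
    """True if the insertion point falls within the first `window` residues.

    residues_before_insert == 0 means the extension sits at the very
    amino terminus; the inclusive upper bound is an assumption.
    """
    return 0 <= residues_before_insert <= window


def insert_heterologous(su_sequence: str, heterologous: str, residues_before_insert: int) -> str:
    """Return the surface glycoprotein sequence with the heterologous
    sequence inserted after the given number of residues."""
    if not 0 <= residues_before_insert <= len(su_sequence):
        raise ValueError("insertion point lies outside the sequence")
    n = residues_before_insert
    return su_sequence[:n] + heterologous + su_sequence[n:]


if __name__ == "__main__":
    placeholder_su = "ABCDEFGHIJ"  # placeholder letters, not a real sequence
    print(insert_heterologous(placeholder_su, "xxx", 5))
    print(is_amino_terminal_attachment(5))
```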

The nucleic acid sequence that encodes the amino acid sequence attached to the amino-terminal portion of an ALV surface glycoprotein amino acid sequence can encode any amino acid sequence heterologous to any naturally occurring ALV amino acid sequence. Such nucleic acid sequences include, without limitation, sequences that encode epitopes (e.g., the FLAG® epitope), ligands (e.g., the EGF ligand), protease cleavage sites (e.g., a Factor Xa cleavage site), linkers (e.g., a G4S linker), and/or randomized amino acid sequences of any length. In addition, such nucleic acid sequences can encode linear polypeptides or cyclic polypeptides. For example, a randomized nucleic acid sequence can be flanked by cysteine residues such that the cysteine residues form a cyclic structure via a covalent linkage. Further, such nucleic acid sequences can encode an amino acid motif (e.g., an N-linked glycosylation signal) that is modified via glycosylation. For example, a nucleic acid sequence can encode NXT or NXS, where N represents an asparagine residue, X represents any amino acid residue, T represents a threonine residue, and S represents a serine residue. The length of the heterologous amino acid sequence attached to the amino-terminal portion of an ALV surface glycoprotein amino acid sequence can be greater than 5 (e.g., greater than 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 35, 50, 75, 100, 250, 500, or 1000) amino acid residues. For example, the heterologous amino acid sequence attached to the amino-terminal portion of an ALV surface glycoprotein amino acid sequence can be between 5 and 5000 amino acid residues in length (e.g., between 5 and 1000, 5 and 500, 10 and 500, 10 and 250, or 10 and 100 amino acid residues in length). In one embodiment, a nucleic acid molecule within the scope of the invention contains, in the 5′ to 3′ direction, a first restriction enzyme cleavage site, a sequence that encodes 10 to 50 amino acid residues, a second restriction enzyme cleavage site, a sequence that encodes a G4S linker, and a sequence that encodes an ALV surface glycoprotein.
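The layout of that embodiment can be sketched at the amino acid level. The following Python illustration assembles display extensions of the form AAQPA, randomized peptide, AAA, GGGGS, using the Sfi I and Not I site residues and the G4S linker described for FIG. 9. Randomizing directly at the amino acid level with a uniform choice over the 20 standard residues is a simplifying assumption (an actual library would be randomized at the nucleic acid level), and all function names are hypothetical.

```python
import random
from typing import List

SFI_I_RESIDUES = "AAQPA"   # amino acids of the Sfi I site (SEQ ID NO:8)
NOT_I_RESIDUES = "AAA"     # amino acids of the Not I site
G4S_LINKER = "GGGGS"       # four glycine residues followed by a serine residue
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes


def random_peptide(length: int, rng: random.Random) -> str:
    """Return a peptide randomized at every position (10 to 50 residues)."""
    if not 10 <= length <= 50:
        raise ValueError("this embodiment uses 10 to 50 randomized residues")
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def build_display_extension(peptide: str) -> str:
    """Site 1 residues + peptide + site 2 residues + G4S linker.

    The surface glycoprotein sequence would follow the linker.
    """
    return SFI_I_RESIDUES + peptide + NOT_I_RESIDUES + G4S_LINKER


def build_library(size: int, length: int = 10, seed: int = 0) -> List[str]:
    """Generate `size` extensions; the seed only makes the sketch repeatable."""
    rng = random.Random(seed)
    return [build_display_extension(random_peptide(length, rng)) for _ in range(size)]


if __name__ == "__main__":
    for extension in build_library(3):
        print(extension)
```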

The nucleic acid molecules provided herein can contain additional nucleic acid sequences. For example, a nucleic acid molecule can contain a nucleic acid sequence that encodes an ALV transmembrane glycoprotein amino acid sequence. Typically, the nucleic acid sequence encoding an ALV transmembrane glycoprotein amino acid sequence is 3′ of the nucleic acid sequence encoding the ALV surface glycoprotein amino acid sequence such that the ALV surface glycoprotein amino acid sequence and the ALV transmembrane glycoprotein amino acid sequence are translated from the same mRNA molecule. While not being limited to any particular mechanism of action, it is believed that the ALV transmembrane glycoprotein amino acid sequence is cleaved from the ALV surface glycoprotein amino acid sequence during or shortly after translation. In one embodiment, a nucleic acid molecule of the invention can contain an entire env sequence from an ALV with a heterologous amino acid sequence attached to the amino-terminal portion of that env sequence.

Additional nucleic acid sequences can be part of a nucleic acid moleculeof the invention. Such additional nucleic acid sequences include,without limitation, retroviral 5′-LTR sequences, retroviral gagsequences, retroviral pol sequences, and retroviral 3′-LTR sequences.For example, a nucleic acid molecule can contain, in the 5′ to 3′direction, an ALV 5′-LTR sequence, an ALV gag sequence, an ALV polsequence, a nucleic acid sequence encoding an ALV surface glycoproteinamino acid sequence with a heterologous amino acid sequence attached tothe amino-terminal portion of that ALV surface glycoprotein amino acidsequence, a nucleic acid sequence encoding an ALV transmembraneglycoprotein amino acid sequence, and an ALV 3′-LTR sequence. Othernucleic acid sequences can be included as well. For example, a nucleicacid molecule of the invention can contain a nucleic acid sequence ofany length between a retroviral env sequence and a retroviral 3′-LTRsequence. Such nucleic acid sequences can encode a polypeptide and canbe heterologous to nucleic acid sequences found in naturally occurringALV. For example, a nucleic acid located between a retroviral envsequence and a retroviral 3′-LTR sequence can encode a mammalianreceptor, a mammalian receptor ligand, an immunoglobulin (e.g.,single-chain antibody), an enzyme (e.g., alkaline phosphatase), anenzyme substrate, a growth factor, a cytokine, or a fragment thereof.

The nucleic acid molecules provided herein can be transcribed to form anRNA molecule that encodes a signal polypeptide followed by a proteasecleavage site followed by an amino acid sequence heterologous tonaturally occurring ALV amino acid sequences followed by an ALV surfaceglycoprotein amino acid sequence followed by an ALV transmembraneglycoprotein amino acid sequence. In this case, the sequence of thesignal polypeptide and protease cleavage site can be encoded by ALV gagand/or ALV env sequences. Once transcribed, the RNA molecule can betranslated to form a polypeptide. During or shortly after translation,the heterologous amino acid sequence can be cleaved from the signalpolypeptide via cleavage at the cleavage site, and the ALV surfaceglycoprotein amino acid sequence can be cleaved from the ALVtransmembrane glycoprotein amino acid sequence releasing a polypeptidecontaining the heterologous amino acid sequence attached to theamino-terminal portion of the ALV surface glycoprotein amino acidsequence and lacking the signal polypeptide, the protease cleavage site,and the ALV transmembrane glycoprotein amino acid sequence.

The nucleic acid molecules provided herein can contain ALV nucleic acid sequences such that cells (e.g., avian cells) transfected with the nucleic acid molecule produce infectious virus particles. Typically, such nucleic acid molecules contain, in the 5′ to 3′ direction, an ALV 5′-LTR sequence, an ALV gag sequence, an ALV pol sequence, a nucleic acid sequence encoding an ALV surface glycoprotein amino acid sequence with a heterologous amino acid sequence attached to the amino-terminal portion of that ALV surface glycoprotein amino acid sequence, a nucleic acid sequence encoding an ALV transmembrane glycoprotein amino acid sequence, and an ALV 3′-LTR sequence. It is noted that little or no ALV surface glycoprotein is shed from infectious ALV particles because ALV surface glycoproteins typically are covalently attached to ALV transmembrane glycoproteins. It also is noted that an additional nucleic acid sequence having a length up to 2.5 kb can be inserted between the nucleic acid sequence encoding an ALV transmembrane glycoprotein amino acid sequence and the ALV 3′-LTR sequence. This additional nucleic acid sequence can encode one or more polypeptides and can be heterologous to nucleic acid sequences found in naturally occurring ALVs. For example, this additional nucleic acid sequence can encode a mammalian receptor, a mammalian receptor ligand, an immunoglobulin, an enzyme (e.g., alkaline phosphatase), or an enzyme substrate.

The nucleic acid molecules provided herein also can contain nucleic acidsequences such that the nucleic acid molecules encodereplication-competent retrovirus (e.g., replication-competent ALV). Forexample, a nucleic acid molecule of the invention can contain viralsequences such that replication-competent retroviruses expressingpolypeptides having a heterologous amino acid sequence attached to theamino-terminal portion of an ALV surface glycoprotein amino acidsequence are produced. As described herein, such a nucleic acid moleculecan be the ALV(A) retroviral vector containing a nucleic acid sequenceencoding a heterologous amino acid sequence that is inserted 5′ of theenv sequence.

Alternatively, the nucleic acid molecules provided herein can contain nucleic acid sequences such that the nucleic acid molecules encode replication-defective retrovirus (e.g., replication-defective ALV). For example, a nucleic acid molecule of the invention can contain viral sequences such that replication-defective retroviruses expressing polypeptides having a heterologous amino acid sequence attached to the amino-terminal portion of an ALV surface glycoprotein amino acid sequence are produced.

Briefly, vectors encoding replication-competent or replication-defective retroviruses can be produced using standard virology techniques. Such vectors can be based on any ALV, murine leukemia virus (MLV), spleen necrosis virus (SNV), feline leukemia virus (FeLV), feline immunodeficiency virus (FIV), simian immunodeficiency virus (SIV), human immunodeficiency virus 1 or 2 (HIV-1; HIV-2), or equine infectious anemia virus (EIAV) as well as any other enveloped virus such as herpes simplex viruses (HSV) or measles viruses.

As described herein, ALV surface glycoproteins having amino-terminalpolypeptide extensions of various lengths can be efficientlyincorporated into infectious virions. In addition, viruses containingALV surface glycoproteins having amino-terminal polypeptide extensionsof various lengths can replicate efficiently, reaching infectious titerscomparable to wild-type viruses. Further, viruses containing ALV surfaceglycoproteins having amino-terminal polypeptide extensions of variouslengths (1) can stably retain the amino-terminal polypeptide extensionsafter repeated virus repassage and (2) can bind both specificimmobilized ligands as well as cells expressing specific ligands. Thus,the nucleic acid molecules provided herein can be used to makepolypeptide display libraries containing infectious virions thatreplicate efficiently and stably present polypeptide sequences (e.g.,amino acid sequences heterologous to naturally occurring ALV amino acidsequences) that can bind specific molecules such as cell receptors.

Nucleic acid molecules within the scope of the invention can be obtainedusing any method including, without limitation, common molecular cloningand chemical nucleic acid synthesis techniques. For example, PCR can beused to construct nucleic acid molecules that encode polypeptides whereeach polypeptide has a different amino acid sequence (e.g., a differentnon-ALV sequence) attached to the amino-terminal portion of an ALVsurface glycoprotein amino acid sequence. PCR refers to a procedure ortechnique in which target nucleic acid is amplified in a manner similarto that described in U.S. Pat. No. 4,683,195, and subsequentmodifications of the procedure described therein.

2. Nucleic Acid Libraries

The invention provides collections of the nucleic acid molecules described herein. For example, the invention provides libraries of different nucleic acid molecules that encode polypeptides where each polypeptide has a different heterologous amino acid sequence (e.g., a different non-ALV sequence) attached to the amino-terminal portion of an ALV surface glycoprotein amino acid sequence. As described herein, each nucleic acid molecule within a library can encode a replication-competent retrovirus (e.g., replication-competent ALV) or a replication-deficient retrovirus (e.g., replication-deficient ALV). Typically, each nucleic acid molecule within a collection contains (1) a nucleic acid sequence that encodes a polypeptide having a different heterologous amino acid sequence attached to the amino-terminal portion of an ALV surface glycoprotein amino acid sequence and (2) viral nucleic acid sequences such that replication-competent retroviruses displaying that polypeptide are produced. In this case, the nucleic acid molecules can be used to create a library of retrovirus particles that (1) display different polypeptides having an ALV surface glycoprotein amino acid sequence with a heterologous amino-terminal extension and (2) contain the nucleic acid molecule that encodes that polypeptide. Thus, retroviruses that display a particular polypeptide having a heterologous amino-terminal extension with a desired activity can be selected and then replicated such that the nucleic acid sequence encoding that polypeptide can be identified.

Again, the invention provides collections of nucleic acid molecules that can be used to generate retroviral polypeptide display libraries where each retroviral particle displays an ALV surface glycoprotein amino acid sequence with a unique heterologous amino-terminal extension. For example, each viral particle can have the same ALV surface glycoprotein amino acid sequence but a different heterologous amino-terminal extension. Typically, the collections of nucleic acid molecules will contain a large number of different nucleic acid molecules. For example, a collection of nucleic acid molecules can contain greater than 500, 10³, 10⁴, 10⁵, 10⁶, 10⁷, 10⁸, 10⁹, or 10¹⁰ different nucleic acid molecules. Such collections of nucleic acid molecules can be obtained using standard molecular biology techniques such as molecular cloning and PCR. For example, restriction enzymes can be used to move polypeptide-encoding sequences and fragments of polypeptide-encoding sequences from commercially available expression libraries into retroviral vectors such as ALV(A). In addition, PCR can be used as described in Buchholz et al. (Nat. Biotech., 16:951-954 (1998)) to generate randomized nucleic acid sequences.
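
To put the library sizes listed above in context, the following sketch relates the length of a fully randomized peptide to the number of distinct sequences it could in principle encode. It is an illustration only and assumes that each randomized position can be any of the 20 standard amino acids with no codon bias or stop codons; the function names are hypothetical.

```python
def sequence_space(length, alphabet_size=20):
    """Number of distinct peptides of the given length (assumes 20 residues per position)."""
    return alphabet_size ** length


def min_length_to_exceed(library_size, alphabet_size=20):
    """Smallest randomized length whose sequence space is larger than library_size."""
    length = 1
    while sequence_space(length, alphabet_size) <= library_size:
        length += 1
    return length
```

Under these assumptions, a randomized stretch of seven residues already spans 20⁷ (about 1.3×10⁹) possible sequences, so a collection of 10¹⁰ different members would require at least eight randomized positions.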

Each nucleic acid molecule of a collection of nucleic acid molecules can contain an additional nucleic acid sequence that (1) is heterologous to naturally occurring ALV sequences and (2) is located between an ALV env sequence and an ALV 3′LTR sequence. This additional nucleic acid sequence can be any length and can encode a polypeptide (e.g., an enzyme, cell receptor, or ligand). For example, this additional nucleic acid sequence can be 25, 50, 100, 150, 200, 300, 500, 1000, 1500, 2000, or more nucleotides in length. In addition, this additional nucleic acid sequence can be identical for each nucleic acid molecule of a collection or it can be different for each nucleic acid molecule of a collection. For example, each nucleic acid molecule of a collection of nucleic acid molecules that encodes a polypeptide having a different heterologous amino acid sequence attached to the amino-terminal portion of an ALV surface glycoprotein amino acid sequence can contain an additional nucleic acid sequence that encodes alkaline phosphatase and is located between an ALV env sequence and an ALV 3′LTR sequence. Alternatively, each nucleic acid molecule that encodes a polypeptide having a different heterologous amino acid sequence attached to the amino-terminal portion of an ALV surface glycoprotein amino acid sequence can contain a different additional nucleic acid sequence located between an ALV env sequence and an ALV 3′LTR sequence. In this latter case, the collection of nucleic acid molecules can be considered a combination of two different libraries: one being a library of different amino-terminal extensions, and the other being a library of different additional nucleic acid sequences.

Typically, each nucleic acid molecule within a double-library collection contains (1) a nucleic acid sequence that encodes a polypeptide having a different heterologous amino acid sequence attached to the amino-terminal portion of an ALV surface glycoprotein amino acid sequence, (2) an additional nucleic acid sequence located between an ALV env sequence and an ALV 3′LTR sequence, where the additional nucleic acid sequence is heterologous to naturally occurring ALV sequences and encodes a polypeptide, and (3) viral nucleic acid sequences such that replication-competent retroviruses expressing both polypeptides are produced. In this case, the nucleic acid molecules can be used to create a library of retrovirus particles that (1) display different polypeptides having an ALV surface glycoprotein amino acid sequence with a heterologous amino-terminal extension, (2) express different heterologous polypeptides that are not attached to an ALV surface glycoprotein amino acid sequence, and (3) contain a nucleic acid molecule that encodes both polypeptides. Thus, retroviruses that exhibit a desired activity as a result of expressing particular combinations of the two varied polypeptides can be selected and then replicated such that the nucleic acid sequences encoding those two polypeptides can be identified.

3. Polypeptides and Polypeptide Libraries

The invention provides polypeptides having an ALV surface glycoproteinamino acid sequence with a heterologous amino-terminal extension.Polypeptides having an ALV surface glycoprotein amino acid sequence witha heterologous amino-terminal extension can be substantially pure. Theterm “substantially pure” as used herein with reference to a polypeptidemeans the polypeptide is substantially free of other polypeptides,lipids, carbohydrates, and nucleic acid. Thus, a substantially purepolypeptide is any polypeptide that is at least about 65, 70, 75, 80,85, 90, 95, or 99 percent pure. Typically, a substantially purepolypeptide will yield a single major band on a non-reducingpolyacrylamide gel.

Any method can be used to obtain a polypeptide. For example, common polypeptide purification techniques such as affinity chromatography and HPLC as well as polypeptide synthesis techniques can be used. In addition, any material can be used as a source to obtain a polypeptide within the scope of the invention. For example, a retrovirus described herein can be selected for having a desired activity and replicated so that the nucleic acid sequence encoding the polypeptide responsible for that desired activity is identified. Once identified, the nucleic acid sequence can be used to produce a polypeptide preparation. This resulting polypeptide preparation can then be used to study the desired activity, to produce antibodies, or to identify agonists or antagonists of the desired activity.

The invention also provides collections of the polypeptides describedherein. For example, the invention provides libraries of differentpolypeptides where each polypeptide has a different heterologous aminoacid sequence (e.g., a different non-ALV sequence) attached to theamino-terminal portion of an ALV surface glycoprotein amino acidsequence. Typically, the collections of polypeptides will contain alarge number of different polypeptides. For example, a collection ofpolypeptides can contain greater than 500, 10³, 10⁴, 10⁵, 10⁶, 10⁷, 10⁸,10⁹, or 10¹⁰ different polypeptides. Such collections of polypeptidescan be obtained, for example, by cleaving surface polypeptides fromretroviral particles that display a polypeptide having an ALV surfaceglycoprotein amino acid sequence with a heterologous amino-terminalextension.

4. Viruses and Virus Libraries

The invention provides viruses, each virus containing a nucleic acidmolecule that encodes a polypeptide having an ALV surface glycoproteinamino acid sequence with a heterologous amino-terminal extension.Viruses containing such nucleic acid molecules are not required toexpress the encoded polypeptide. Nevertheless, such viruses typicallyexpress the encoded polypeptide. For example, an ALV containing anucleic acid molecule that encodes a polypeptide having an ALV surfaceglycoprotein amino acid sequence with a heterologous amino-terminalextension can display the encoded polypeptide on the surface of itsparticle.

Any virus can contain a nucleic acid molecule that encodes a polypeptidehaving an ALV surface glycoprotein amino acid sequence with aheterologous amino-terminal extension. Such viruses include, withoutlimitation, retroviruses such as ALVs, MLVs, SNVs, FeLVs, FIVs, SIVs,HIV-1, HIV-2, and EIAVs as well as other enveloped viruses such as HSVsand measles viruses. Viruses containing a nucleic acid molecule thatencodes a polypeptide having an ALV surface glycoprotein amino acidsequence with a heterologous amino-terminal extension can bereplication-competent or replication-defective. In addition, the nucleicacid molecule within the virus can contain any of the nucleic acidsequences described herein. For example, a retrovirus can contain anucleic acid molecule having (1) a nucleic acid sequence that encodes apolypeptide having an ALV surface glycoprotein amino acid sequence witha heterologous amino-terminal extension and (2) an additional nucleicacid sequence located between an ALV env sequence and an ALV 3′LTRsequence, where the additional nucleic acid sequence is heterologous tonaturally occurring ALV sequences and encodes a polypeptide. The virusesdescribed herein can lack Src viral sequences.

Any method can be used to identify viruses containing a nucleic acidmolecule of the invention. Such methods include, without limitation, PCRand nucleic acid hybridization techniques such as Northern and Southernanalysis. In some cases, immunohistochemistry and biochemical techniquescan be used to determine if a virus contains a particular nucleic acidmolecule by detecting the expression of a polypeptide encoded by thatparticular nucleic acid molecule.

The invention also provides viruses, each virus containing a polypeptidehaving (1) an ALV surface glycoprotein amino acid sequence and (2) anamino acid sequence heterologous to any naturally occurring ALV aminoacid sequence. Viruses containing such polypeptides are not required tocontain nucleic acid molecules that encode the polypeptide. For example,cell lines that express a polypeptide having an ALV surface glycoproteinamino acid sequence with a heterologous amino-terminal extension can beused to make viruses that display that polypeptide without containing anucleic acid sequence that encodes it. Nevertheless, a virus containinga polypeptide having an ALV surface glycoprotein amino acid sequencewith a heterologous amino-terminal extension typically will contain anucleic acid molecule that encodes that polypeptide. For example, an ALVcontaining a polypeptide having an ALV surface glycoprotein amino acidsequence with a heterologous amino-terminal extension displayed on thesurface of its particle typically contains a nucleic acid sequence thatencodes that polypeptide.

Any virus can contain a polypeptide having an ALV surface glycoproteinamino acid sequence with a heterologous amino-terminal extension. Suchviruses include, without limitation, retroviruses such as ALVs, MLVs,SNVs, FeLVs, FIVs, SIVs, HIV-1, HIV-2, and EIAVs as well as otherenveloped viruses such as HSVs and measles viruses. Viruses containing apolypeptide having an ALV surface glycoprotein amino acid sequence witha heterologous amino-terminal extension can be replication-competent orreplication-defective. In addition, the nucleic acid molecule within thevirus can contain any of the nucleic acid sequences described herein.For example, a retrovirus can contain (1) a polypeptide having an ALVsurface glycoprotein amino acid sequence with a heterologousamino-terminal extension and (2) a nucleic acid sequence located betweenan ALV env sequence and an ALV 3′LTR sequence, where the nucleic acidsequence is heterologous to naturally occurring ALV sequences andencodes a polypeptide. The viruses described herein can lack Src viralsequences.

Any method can be used to identify viruses containing a polypeptide of the invention. Such methods include, without limitation, immunohistochemistry and biochemical techniques.

The invention also provides collections of any of the viruses describedherein. For example, the invention provides libraries of differentviruses that display polypeptides where each polypeptide has a differentheterologous amino acid sequence attached to the amino-terminal portionof an ALV surface glycoprotein amino acid sequence. As described herein,each virus within a library can be a replication-competent retrovirus(e.g., replication-competent ALV) or a replication-deficient retrovirus(e.g., replication-deficient ALV). Typically, each virus within acollection (1) displays a polypeptide having a different heterologousamino acid sequence attached to the amino-terminal portion of an ALVsurface glycoprotein amino acid sequence on the surface of its particleand (2) contains a nucleic acid sequence that encodes the displayedpolypeptide. Thus, retroviruses that display a particular polypeptidehaving a heterologous amino-terminal extension with a desired activitycan be selected and then replicated such that the nucleic acid sequenceencoding that polypeptide can be identified.

The collections of viruses can contain a large number of differentviruses. For example, an ALV polypeptide display library can containgreater than 500, 10³, 10⁴, 10⁵, 10⁶, 10⁷, 10⁸, 10⁹, or 10¹⁰ differentmembers. Such collections of viruses can be obtained using thetechniques described herein. For example, PCR can be used as describedin Buchholz et al. (Nat. Biotech., 16:951-954 (1998)) to generaterandomized nucleic acid sequences that are inserted into theamino-terminal portion of an ALV glycoprotein amino acid sequence. Theresulting nucleic acid molecules can then be cloned into a retroviralvector. The resulting retroviral vectors can be transfected into cellssuch that retroviral particles are produced.

5. Methods for Obtaining Displayed Polypeptide Sequences

The invention provides methods for obtaining displayed polypeptide sequences that interact with biological molecules (e.g., cell receptors and cell glycoproteins) and/or cells (e.g., cancer cells). Such methods include (1) contacting a sample with one of the collections of viruses described herein and (2) isolating any virus that binds to a component within the sample. For example, an ALV polypeptide display library containing greater than 10⁵ replication-competent ALVs where each virus displays an ALV surface glycoprotein having a different heterologous amino-terminal extension can be incubated with a sample. The sample can be any type of biological sample such as immobilized polypeptides or cultured cells. Other examples of samples that can be used include, without limitation, cell suspensions, primary cultures, tissue sections, tissue dissections, cell homogenates, crude polypeptide preparations, purified polypeptide preparations, and carbohydrate preparations. When using cells, the cells can be of any type and can be in vitro or in vivo. For example, a cellular sample can contain cancer cells, liver cells, neurons, lymphocytes, endothelial cells, skin cells, dendritic cells, macrophages, and/or stem cells. It is noted that a cellular sample can contain a collection of different cells (e.g., a mixture of lymphocytes and polymorphonuclear cells) or can contain cells of the same type (e.g., a clonal culture of cancer cells). Examples of cancer cells that can be used include, without limitation, head and neck cancer cells, breast cancer cells, prostate cancer cells, lung cancer cells, colorectal cancer cells, pancreas cancer cells, glioma cells, lymphoma cells, myeloma cells, and leukemia cells.

Any method can be used to isolate viruses that bind a component within asample. For example, viruses bound to an immobilized polypeptidepreparation can be isolated by (1) washing the preparation to remove anyunbound viruses, (2) adding cells known to be susceptible to viralinfection to the preparation, and (3) harvesting viral particles thatwere amplified as a result of viral infection. Once harvested, theviruses can be evaluated to determine the particular nucleic acidsequence that encoded the displayed polypeptide responsible for thebinding activity.

When using cells in vitro or in vivo, the cells can be cells that do notexpress receptors for the wild-type viruses. In the case of ALV,wild-type ALV do not infect mammalian cells since mammalian cells do notexpress receptors for ALV. Thus, the infectious ALV polypeptide displaylibraries provided herein can be incubated with mammalian cells toidentify displayed polypeptide sequences that allow ALVs to infect themammalian cells. For example, the ALV viruses provided herein can beincubated with mammalian cells. After incubation, viruses that infectedthe mammalian cells can be isolated by (1) washing the cells to removeany unbound viruses and (2) harvesting viral particles that wereamplified as a result of viral infection. Once harvested, the virusescan be evaluated to determine the particular nucleic acid sequence thatencoded the displayed polypeptide responsible for the virus particle'sability to infect the mammalian cells.

Many other methods and techniques can be used to identify displayed polypeptide sequences having a desired activity. In fact, the methods and techniques commonly used with phage display libraries can be employed using the viruses and viral polypeptide display libraries provided herein. For example, the viruses and viral polypeptide display libraries provided herein can be used in a manner similar to the phage display libraries described elsewhere (Arap et al., Science, 279:377-380 (1998); Ellerby et al., Nature Med., 5:1032-1038 (1999); Pasqualini and Ruoslahti, Nature, 380:364-366 (1996); Rajotte et al., J. Clin. Invest., 102:430-437 (1998); and Trepel et al., Hum. Gene Ther., 11:1971-1981 (2000)).

Once a particular displayed polypeptide having a desired activity has been identified, any biological molecule (e.g., cell receptors and cell glycoproteins) that interacts with that displayed polypeptide can be identified. For example, the displayed polypeptide sequence that allows an ALV to infect a mammalian cancer cell can be isolated or synthesized to obtain a substantially pure polypeptide preparation. That substantially pure polypeptide preparation can be used to isolate the molecule that interacts with it via, for example, affinity chromatography. In addition, any of the common molecular biology techniques such as expression cloning and yeast two-hybrid systems can be used to identify polypeptides that interact with displayed polypeptides. For example, the methods described in Smith and Petrenko (Chem. Rev., 97:391-410 (1997)) and Rajotte and Ruoslahti (J. Biol. Chem., 274:11593-11598 (1999)) can be used to obtain a polypeptide that specifically interacts with a particular displayed polypeptide sequence. It is noted that a substantially pure polypeptide preparation of a displayed polypeptide sequence can be used to produce antibodies. Such antibodies can be used to help identify polypeptides that interact with displayed polypeptides.

The invention will be further described in the following examples, which do not limit the scope of the invention described in the claims.

EXAMPLES

Example 1: Infectious ALV Molecular Clones with Envelope Glycoproteins Having Additional Polypeptide Epitopes as N-Terminal Extensions

Five constructs were generated from the ALV(A) retroviral vectorRCASBP(A)AP. This vector is described elsewhere (Federspiel and Hughes,Retroviral gene delivery. In: Muscle: Methods for Cell and MuscleResearch, Eds. Emerson and Sweeney, Academic Press. pp. 179-214 (1997)).Construct 1 contained, in the 5′ to 3′ direction, the ALV(A) retroviral5′ LTR, the gag, pol, and env viral sequences, a nucleic acid sequenceencoding an alkaline phosphatase (AP) polypeptide, and the ALV(A)retroviral 3′ LTR. Constructs 2, 3, 4, and 5 were identical to construct1 with the exception that each contained an additional nucleic acidsequence that was inserted, in frame, at the 5′ end of the env viralsequence (FIG. 1). For construct 2, the inserted nucleic acid sequenceencoded a FLAG® epitope (DYKDDDDK; SEQ ID NO:7). For construct 3, theinserted nucleic acid sequence was, in the 5′ to 3′ direction, (1) asequence that encoded a FLAG® epitope, (2) a sequence recognized by theSfiI restriction enzyme and encoding AAQPA (SEQ ID NO:8), (3) a sequenceencoding a 53-amino acid EGF ligand, (4) a sequence encoding a Factor Xacleavage site (IEGR; SEQ ID NO:9), (5) a sequence encoding a G4S linker(GGGGS; SEQ ID NO:10), and (6) a sequence recognized by the NotIrestriction enzyme and encoding AAA. For construct 4, the insertednucleic acid sequence was, in the 5′ to 3′ direction, (1) a sequencethat encoded a FLAG® epitope, (2) a sequence recognized by the SfiIrestriction enzyme and encoding AAQPA (SEQ ID NO:8), (3) a sequenceencoding a 53-amino acid EGF ligand, (4) a sequence encoding a G4Slinker, (5) a sequence encoding a Factor Xa cleavage site, (6) asequence encoding a G4S linker, and (7) a sequence recognized by theNotI restriction enzyme and encoding AAA. 
For construct 5, the insertednucleic acid sequence was, in the 5′ to 3′ direction, (1) a sequencethat encoded a FLAG® epitope, (2) a sequence recognized by the SfiIrestriction enzyme and encoding AAQPA (SEQ ID NO:8), (3) a sequenceencoding a 53-amino acid EGF ligand, (4) a sequence encoding a Factor Xacleavage site, (5) a sequence encoding three G4S linkers in tandem, and(6) a sequence recognized by the NotI restriction enzyme and encodingAAA.
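
The layout of the inserted sequences in constructs 2 through 5 can be summarized at the amino acid level with a short sketch. It is an illustration only: the FLAG, SfiI-encoded, Factor Xa, G4S, and NotI-encoded segments use the sequences given above, while the 53-residue EGF ligand sequence is not reproduced here, so a placeholder of 53 "X" characters stands in for it. The names `INSERTS`, `extension`, and `extension_lengths` are hypothetical.

```python
FLAG = "DYKDDDDK"           # FLAG epitope (SEQ ID NO:7)
SFI_ENCODED = "AAQPA"       # residues encoded by the SfiI site (SEQ ID NO:8)
FACTOR_XA = "IEGR"          # Factor Xa cleavage site (SEQ ID NO:9)
G4S = "GGGGS"               # G4S linker (SEQ ID NO:10)
NOTI_ENCODED = "AAA"        # residues encoded by the NotI site
EGF_PLACEHOLDER = "X" * 53  # placeholder only; stands in for the 53-residue EGF ligand

# Segment order for each construct, N-terminal to C-terminal, as described above.
INSERTS = {
    2: [FLAG],
    3: [FLAG, SFI_ENCODED, EGF_PLACEHOLDER, FACTOR_XA, G4S, NOTI_ENCODED],
    4: [FLAG, SFI_ENCODED, EGF_PLACEHOLDER, G4S, FACTOR_XA, G4S, NOTI_ENCODED],
    5: [FLAG, SFI_ENCODED, EGF_PLACEHOLDER, FACTOR_XA, G4S * 3, NOTI_ENCODED],
}


def extension(construct):
    """Concatenate the segments of one construct's N-terminal extension."""
    return "".join(INSERTS[construct])


def extension_lengths():
    """Length in residues of each construct's encoded extension."""
    return {number: len(extension(number)) for number in INSERTS}
```

With these segments, the encoded extensions are 8, 78, 83, and 88 residues long for constructs 2, 3, 4, and 5, respectively; the constructs differ only in the linker and cleavage-site arrangement between the EGF ligand and the envelope sequence.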

The nucleic acid sequence encoding the FLAG® epitope, an eight aminoacid polypeptide sequence, was included so that virus particlesdisplaying the FLAG® epitope as an N-terminal extension of the ALVsurface glycoprotein could be detected using, for example, anti-FLAG®epitope antibodies. Likewise, the nucleic acid sequence encoding the EGFligand, a 53-amino acid polypeptide sequence, was included so that virusparticles displaying a properly folded EGF ligand as an N-terminalextension of the ALV surface glycoprotein could be detected using, forexample, anti-EGF ligand antibodies. The Factor Xa protease cleavagesite was included to help demonstrate the presence of the appropriateepitopes since Factor Xa could be used to cleave the polypeptideextensions from the remaining envelope sequence. Each construct alsocontains a sequence encoding an AP polypeptide to aid in monitoringvirus replication and in quantifying viral titers.

To determine if the N-terminal extensions of the envelope glycoproteins could be tolerated in replicating viruses, plasmid DNA containing the infectious molecular clones (constructs 1-5) was transfected into separate cultures of chicken fibroblast DF-1 cells, and the cultures were passaged to allow virus production and spread. The four constructs (constructs 2-5) containing N-terminal extensions resulted in infectious virus production although possibly at a slower rate compared to the production using the wild-type construct (construct 1; FIG. 2). In addition, the titers of the infectious viruses were determined by serial dilution of day 20 culture supernatants. Briefly, the serially diluted supernatants were used to infect fresh DF-1 cells. After two days, the number of AP-positive cells was determined. The titer for the viruses from the wild-type construct (construct 1) was 5×10⁶ infectious units per mL (ifu/mL), while the titers for the viruses from all four chimeric constructs (constructs 2-5) were 1×10⁶ ifu/mL. These results demonstrate that ALV viruses with envelope glycoproteins having non-viral N-terminal polypeptide extensions can replicate efficiently, reaching infectious titers comparable to wild-type ALV viruses.
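
The arithmetic behind a serial-dilution titer of this kind can be sketched as follows. This is a general illustration and not necessarily the exact calculation used for the titers above: it assumes that each AP-positive cell counted reflects one infectious unit, and the cell count, dilution, and inoculum volume in the test values are hypothetical.

```python
import math


def titer_ifu_per_ml(positive_cells, dilution_fold, inoculum_ml):
    """Estimate titer as AP-positive cells x fold dilution / inoculum volume (mL).

    Assumes one infectious unit per AP-positive cell counted.
    """
    if dilution_fold <= 0 or inoculum_ml <= 0:
        raise ValueError("dilution_fold and inoculum_ml must be positive")
    return positive_cells * dilution_fold / inoculum_ml


def fold_difference(titer_a, titer_b):
    """Ratio of two titers, e.g. wild-type versus chimeric virus."""
    return titer_a / titer_b
```

For the titers reported above, `fold_difference(5e6, 1e6)` gives a five-fold difference between the wild-type and chimeric viruses.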

The following experiments were performed to determine whether the chimeric envelope glycoproteins were efficiently incorporated into ALV virions. ALV virions were pelleted from 3 mL of culture supernatants obtained from 20 day cultures. The polypeptides were denatured, separated by 12% SDS-PAGE, and analyzed by Western immunoblot. The filters were probed with either an anti-FLAG® epitope monoclonal antibody (1:2000 dilution; Sigma, St. Louis, Mo.), an anti-human EGF monoclonal antibody (1.0 μg/mL; R & D Systems Inc., Minneapolis, Minn.), or rabbit anti-ALV CA polyclonal sera (1:5000 dilution; Charles River/SPAFAS, North Franklin, Conn.). The rabbit anti-ALV CA polyclonal sera recognize the ALV capsid. The bound antibodies were probed with either an anti-mouse or anti-rabbit antibody conjugated to horseradish peroxidase (HRP). Any resulting immunocomplexes were visualized by chemiluminescence. On Western immunoblots, the estimated size of the construct 1 surface glycoprotein and the construct 2 surface glycoprotein was ˜80 kDa; the estimated size of the EGF containing surface glycoproteins from constructs 3, 4, and 5 was ˜90 kDa; and the estimated size of the ALV capsid for each ALV was ˜26 kDa.

Western immunoblot analysis of viral particles produced by the DF-1 cell cultures demonstrated that the chimeric envelope glycoproteins were incorporated into virions (FIG. 3). In addition, the envelope glycoproteins containing the FLAG® and EGF epitopes (envelope glycoproteins encoded by constructs 3, 4, and 5) were larger on the immunoblots than the envelope glycoproteins containing the FLAG® epitope and not the EGF epitope (envelope glycoproteins encoded by construct 2).

The following experiment was performed to confirm that the chimeric envelope glycoproteins were incorporated into virions and to determine whether the chimeric envelope glycoproteins were sensitive to Factor Xa protease digestion. Virions were pelleted as described above, resuspended in OPTI-MEM (GIBCO/BRL), and incubated with or without Factor Xa protease (100 μg/mL; New England Biolabs, Inc.; Beverly, Mass.) at 37° C. for 90 minutes. After digestion, the samples were denatured, separated by 12% SDS-PAGE, and analyzed by Western immunoblot probed with an anti-ALV(A) SU monoclonal antibody. The bound immunocomplexes were visualized by chemiluminescence. For each surface glycoprotein containing the 53-amino acid EGF epitope, a shift in size was detectable after Factor Xa digestion (FIG. 4). No shift was detected in surface glycoproteins from construct 1. Likewise, given the small size of the FLAG® epitope, no shift was detected in surface glycoproteins from construct 2. These results demonstrate that the N-terminal extensions were accessible to Factor Xa protease cleavage.

The following experiments were performed to determine whether the chimeric envelope glycoproteins were stable after repeated virus re-passage. Stability of the displayed epitopes on ALV glycoproteins is important when ALV is to be used as a polypeptide display platform since most selection protocols will involve the amplification of the viruses that bound to a target. Virus stocks produced by transfecting DF-1 cells with the infectious clone DNA were re-passaged in DF-1 cells after a low MOI infection. Specifically, two rounds of re-passage in DF-1 cells were performed. For the first re-passage, DF-1 cells were infected with virus stocks from 20-day primary cultures at an MOI of 0.001. For the second re-passage, DF-1 cells were infected with virus stocks from 12-day cultures from the first re-passage at an MOI of 0.001. In each case, virus replication was monitored by ELISA using the rabbit anti-ALV CA polyclonal sera. Virus replication was observed during both the first and the second re-passage for each of the construct-containing ALV viruses. As expected, no virus replication was observed in mock-treated cultures.
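
The relationship between MOI, cell number, and stock titer used in such low-MOI infections can be sketched as below. The cell count is a hypothetical value (none is stated in the text), the function names are illustrative, and the fraction of infected cells assumes that infection events follow a Poisson distribution.

```python
import math


def inoculum_volume_ml(moi, cell_count, titer_ifu_per_ml):
    """Volume of virus stock (mL) that delivers the requested MOI."""
    if titer_ifu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return moi * cell_count / titer_ifu_per_ml


def fraction_infected(moi):
    """Expected fraction of cells receiving at least one infectious unit
    (Poisson assumption: P(no hit) = exp(-MOI))."""
    return 1.0 - math.exp(-moi)


# Hypothetical culture of 1e6 DF-1 cells, MOI 0.001, stock at 1e6 ifu/mL.
volume = inoculum_volume_ml(0.001, 1e6, 1e6)
print(f"stock needed: {volume * 1000:.1f} microliters")
print(f"cells infected initially: {fraction_infected(0.001):.4%}")
```

At an MOI of 0.001 only about one cell in a thousand is infected at the outset, so detectable replication in these cultures depends on virus spread during passage.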

In addition, virion glycoproteins produced by the first and second re-passage cultures were analyzed by Western immunoblot using the anti-ALV(A) SU monoclonal antibody, the anti-FLAG® epitope monoclonal antibody, and the anti-human EGF monoclonal antibody (FIG. 6). Using the anti-ALV(A) SU monoclonal antibody, virion glycoproteins were detected for each tested sample (ALV from constructs 1-5) for both the first and second re-passages. Using the anti-FLAG® epitope monoclonal antibody, virion glycoproteins were detected for each tested sample expected to contain the FLAG® epitope (ALV from constructs 2-5) for both the first and second re-passages. Using the anti-human EGF monoclonal antibody, virion glycoproteins were detected for each tested sample expected to contain the EGF epitope (ALV from constructs 3-5) for both the first and second re-passages. For construct 5-containing viruses, a population of viruses lacking the FLAG® and EGF epitopes appeared to be selected over time. From this analysis, at least three of the four tested viruses stably displayed the FLAG® epitope or the FLAG®/EGF epitopes through both re-passages.

To determine if the displayed non-viral epitopes on ALV(A) surface glycoproteins are accessible to bind target proteins, wild-type virions (from construct 1) and chimeric virions (from constructs 2-5) were exposed to tissue culture wells coated with either anti-FLAG® or anti-EGF monoclonal antibodies. Briefly, tissue culture wells were coated with the anti-FLAG® monoclonal antibody (0.5 μg/mL), washed with phosphate buffered saline (PBS) with 0.1% Tween-80, and blocked with PBS with 5% fetal calf serum (FCS). Virus stocks produced by DF-1 cells transfected with constructs 1-5 were incubated in the blocked wells at 4° C. for 60 minutes. After washing the wells three times with PBS, DF-1 cells were added, and the plates were incubated at 39° C. for 2 days. The cells were then fixed with 4% paraformaldehyde and assayed for AP activity. Dark blue/purple cells were positive for AP activity.

AP activity was detected in the wells coated with the anti-FLAG® epitope monoclonal antibodies and containing the virions made from constructs 2-5. Thus, the virions made from constructs 2-5 contained the FLAG® epitope, bound to the wells coated with anti-FLAG® epitope antibodies, and infected the DF-1 cells. AP activity also was detected in the wells coated with the anti-EGF epitope monoclonal antibodies and containing the virions made from constructs 3-5. Thus, the virions made from constructs 3-5 contained the EGF epitope, bound to the wells coated with anti-EGF epitope antibodies, and infected the DF-1 cells. No AP activity was detected in mock controls. These mock controls were cells that were not infected but were subjected to all the assay procedures. The results demonstrated that the FLAG® and EGF epitopes displayed on the virion glycoproteins were accessible to specific binding by the appropriate antibody immobilized on a solid support.

A concern about polypeptide display on an enveloped virus is the potential problem of the virions non-specifically binding to eukaryotic cells. To address this concern and determine if the ALV(A) virions display a functional EGF ligand, wild-type (made from construct 1) and chimeric virions (made from constructs 2-5) were incubated with the human tumor cell line A431. This cell line expresses high levels of the human EGF receptor. Briefly, virus stocks were concentrated by centrifugation (1:10). The concentrated stocks were then incubated with 1×10⁶ A431 cells in suspension (total volume 4 mL) at 4° C. for 1 hour. The virus:cell complexes were washed three times with PBS containing 2% FCS and then incubated with the soluble chicken ALV(A) receptor Tva fused to a mouse IgG (sTva-mIgG). sTva-mIgG binds specifically to ALV(A) surface glycoproteins. After washing the complexes three times with PBS containing 2% FCS, the complexes were incubated with anti-mouse IgG conjugated to phycoerythrin, washed, resuspended in PBS containing 2% FCS, and analyzed with a Becton Dickinson FACSCalibur using CELLQuest 3.1 software. Only the viruses displaying the EGF ligand bound to the A431 cells (FIG. 7). In addition, the binding was specific for the human EGF receptor since addition of 1 μM recombinant EGF (rEGF) significantly reduced virus binding. These results demonstrate that ALV(A) virions displaying the human EGF ligand specifically bind to cells expressing the human EGF receptor.

Taken together, these data demonstrate that viruses displaying chimeric envelope glycoproteins can be produced in high titers, and that they retain their infectivity through multiple passages. In addition, these data demonstrate that epitopes within displayed chimeric envelope glycoproteins are accessible and functional. Further, these data demonstrate the feasibility of using chimeric envelope glycoproteins to deliver or match a virus to a particular target.

Example 2 Generating an ALV Peptide Display Library

The following experiments are performed to generate and characterize ALV polypeptide display libraries containing a diverse array of unglycosylated and/or glycosylated polypeptides. At least three different libraries of polypeptides, 10 to 12 amino acid residues in length, are produced having either randomized residues at all positions, randomized residues at all positions with a fixed N-linked glycosylation site, or randomized residues at all positions with a fixed N-linked glycosylation site flanked by cysteine residues to produce cyclic peptides. The assembly of such libraries can lead to the generation of polypeptides having novel and more diverse binding properties. In fact, using 10 to 12 residue polypeptides can increase the potential of creating unique binding motifs when compared to shorter polypeptides.

Briefly, polypeptide libraries are generated and characterized in plasmids that contain the infectious molecular clone of ALV(A). Then, the plasmid polypeptide library is used to produce the virus library (FIG. 8). The organization of the displayed polypeptides on the ALV(A) surface glycoprotein is slightly different when compared to the organization of constructs 3-5. Each polypeptide is displayed on replicating ALV(A) particles as an N-terminal extension of the viral surface envelope glycoprotein, with a G4S linker located between the N-terminal extension and the surface envelope glycoprotein sequence (FIG. 9). In addition, each polypeptide is encoded by a nucleic acid sequence located between SfiI and NotI cloning sites.

One library is designed to contain linear 10-mer polypeptides, X₁₀, randomized at all positions. A second library is designed to contain linear 12-mer polypeptides of the general format, X₂NXTX₇ (SEQ ID NO:16) or X₂NXSX₇ (SEQ ID NO:17), where the NXT or NXS represents a fixed N-linked glycosylation signal of three amino acids (asparagine-X-threonine or asparagine-X-serine). A third library is designed to contain cyclic glycosylated polypeptides of the same general format as the second library but containing fixed cysteines as follows: CX₂NXTX₇C (SEQ ID NO:11) or CX₂NXSX₇C (SEQ ID NO:12).
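
The three library formats can be written out position by position with a short sketch such as the following. The plain-digit notation (e.g. `X2NXTX7` for X₂NXTX₇), the function names, and the test peptides are illustrative assumptions; X is treated as any of the 20 standard amino acids, as in the text.

```python
import re

STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def expand_format(fmt):
    """Expand a compact format such as 'X2NXTX7' into one letter per position."""
    parts = re.findall(r"([A-Z])(\d*)", fmt)
    return "".join(letter * (int(count) if count else 1) for letter, count in parts)


def format_to_regex(fmt):
    """Build a regular expression in which X matches any standard residue."""
    expanded = expand_format(fmt)
    body = "".join(f"[{STANDARD_AMINO_ACIDS}]" if ch == "X" else ch for ch in expanded)
    return re.compile(f"^{body}$")


def matches_format(fmt, peptide):
    """Return True if the peptide fits the library format."""
    return format_to_regex(fmt).match(peptide) is not None


for fmt in ("X10", "X2NXTX7", "X2NXSX7", "CX2NXTX7C", "CX2NXSX7C"):
    expanded = expand_format(fmt)
    print(fmt, expanded, "length:", len(expanded), "randomized:", expanded.count("X"))
```

Written out this way, each format carries ten randomized positions; the linear glycosylated formats are 12 residues long and the cysteine-flanked formats are 14 residues long including the two fixed cysteines.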

PCR randomization of the base nucleotide sequence is used to construct the polypeptide libraries as described elsewhere (Buchholz et al., Nat. Biotech., 16:951-954 (1998)). Briefly, an oligonucleotide primer that contains the unique KpnI site just upstream of the env splice acceptor site and a series of oligonucleotide primers that contain the randomized sequence encoding the polypeptide library, flanked by the SfiI and NotI sites and containing part of the signal peptide, are used to amplify the ~250 bp region. To reduce the frequency of termination signals in the random part of the oligonucleotides, the wobble positions of the codons are restricted to G and T residues. This restriction is designed to exclude two of the three stop codons while maintaining the inclusion of all possible amino acid residues. The amplified product is digested with KpnI and NotI and cloned into the KpnI/NotI sites of the RCASBP(A)AP display vector, a plasmid containing an infectious molecular clone of ALV(A). The plasmid library is transformed into electrocompetent DH5α bacterial host cells. The scale of ligation and transformation is sufficient to ensure that the library diversity is more than 10⁷ independent clones in each library. Successful PCR randomization of the sequences encoding the polypeptide extensions is confirmed by DNA sequencing of at least 50 independent clones from the library.
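
The effect of restricting the wobble position to G and T can be checked against the standard genetic code with a short sketch. The table below is the standard code written in the conventional T, C, A, G order; the variable and function names are illustrative and not taken from the cited method.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def restricted_codons(wobble_bases="GT"):
    """All codons whose third (wobble) base is one of wobble_bases."""
    return [codon for codon in CODON_TABLE if codon[2] in wobble_bases]


codons = restricted_codons("GT")
stops = [c for c in codons if CODON_TABLE[c] == "*"]
amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}

print("codons available:", len(codons))
print("stop codons remaining:", stops)
print("amino acids encoded:", len(amino_acids))
```

Under the standard code, the restriction leaves 32 codons that still encode all 20 amino acids, with TAG as the only remaining stop codon (TAA and TGA are excluded).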

The virus library is produced by transfecting the plasmid library into multiple large flasks of chicken DF-1 cells using calcium phosphate precipitation. To characterize the virus library, genomic RNA is purified from pelleted virus particles. Once purified, the region encoding the randomized polypeptide sequence is amplified by reverse transcription (RT)-PCR, and the resulting amplification products are cloned into a TA cloning vector for sequencing. The nucleotide sequence, size, and diversity of at least 50 cloned PCR products are determined. A statistical analysis is performed to compare the observed frequency of the different amino acid residues at each randomized position in the polypeptide with the expected frequency as described elsewhere (Buchholz et al., Nat. Biotech., 16:951-954 (1998)). The scale of the virus production should be enough to generate a library with a diversity of greater than 10⁷. Virus library titers of ~10⁶ ifu/mL before virus concentration are obtainable since the viruses with chimeric surface glycoproteins replicated to ~10⁶ ifu/mL as demonstrated herein. Virus titers can be increased by concentrating virus using centrifugation.
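
One way the expected residue frequencies for such a comparison could be derived is sketched below. This is an assumption-laden illustration, not the analysis of the cited reference: it assumes codons with G or T at the wobble position, equimolar bases at every randomized position, and the standard genetic code. The function names and the toy peptides are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def expected_frequencies(wobble_bases="GT"):
    """Expected frequency of each residue (and stop) for equimolar codons."""
    codons = [c for c in CODON_TABLE if c[2] in wobble_bases]
    counts = Counter(CODON_TABLE[c] for c in codons)
    return {aa: n / len(codons) for aa, n in counts.items()}


def observed_frequencies(peptides, position):
    """Observed residue frequencies at one position across sequenced clones."""
    residues = [p[position] for p in peptides if len(p) > position]
    if not residues:
        return {}
    counts = Counter(residues)
    return {aa: n / len(residues) for aa, n in counts.items()}


expected = expected_frequencies()
toy_clones = ["ACDEFGHIKL", "LCDEFGHIKL", "LSDEFGHIKL"]  # hypothetical sequences
print("expected L:", expected["L"], "observed L at position 0:",
      observed_frequencies(toy_clones, 0).get("L", 0.0))
```

Under these assumptions leucine, serine, and arginine are each expected at 3/32 per position, while residues with a single permitted codon, such as tryptophan, are expected at 1/32.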

Example 3 Optimizing a Polypeptide Display Library Selection Protocol

The following techniques are used to select and identify ALV surface polypeptide chimeras that bind to specific ligands on target polypeptides or cells from a large and diverse ALV polypeptide display library. These techniques are designed to select and identify ALV surface polypeptide chimeras through multiple rounds of selection/amplification of the viral polypeptide chimeras that actually bind a target ligand over those that bind non-specifically (i.e., background).

Targets (e.g., proteins or cells) are incubated in vitro with virions displaying an epitope under conditions that optimize specific binding of the displayed epitope to the target. Unbound virus is removed by extensive washing, and the remaining bound virus is amplified by adding DF-1 cells to allow virus infection and growth. The amplified virus pool is then subjected to additional rounds of selection (e.g., incubated in vitro with the original targets) to further define the virus pool containing epitopes that specifically bind the target. After multiple rounds of selection, a population of virions displaying N-terminal polypeptide extensions that specifically interact with the desired target is obtained.

The number of rounds of selection/amplification necessary to identify a polypeptide is determined using different concentrations of the FLAG®-displaying ALV (e.g., virions made from construct 2 described in Example 1) seeded into stocks of wild-type ALV. For example, 1, 2, 5, or 10 ifu of FLAG®-displaying ALV are added to 10⁶ ifu of wild-type ALV to generate virus mixtures. To aid in monitoring the different viruses, the FLAG®-displaying ALV is designed to encode AP polypeptide, and the wild-type ALV is designed to encode a green fluorescent protein (GFP). The virus mixtures are incubated with anti-FLAG® monoclonal antibodies immobilized on culture dishes to bind virus containing the FLAG® epitope, and multiple rounds of amplification are performed. Duplicate aliquots of the virus mixtures are also titered to determine the actual FLAG®-displaying ALV ifu added. The distribution of epitopes in the virus pool after each round of selection is determined by extracting genomic RNA from the virus pool, amplifying the region containing the displayed epitope coding sequence by RT-PCR, cloning the amplified products into TA cloning vectors, and determining the nucleotide sequence of at least 50 clones. The number of rounds necessary to select FLAG®-displaying ALV from within the virus mixtures is used as a starting point for identifying specific interactions between displayed epitopes and any desired target.
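
How the required number of rounds might scale with the seeding ratio can be explored with a simple model. The sketch below is hypothetical throughout: it assumes a constant per-round enrichment factor (the value 50 is invented for illustration, since no enrichment factor is reported) and treats a round as multiplying the ratio of FLAG®-displaying to wild-type virus by that factor.

```python
def rounds_needed(spiked_ifu, background_ifu, enrichment_per_round, target_fraction=0.5):
    """Rounds of selection until the spiked virus reaches target_fraction of the pool.

    Assumes each round multiplies the spiked:background ratio by a constant
    enrichment_per_round (a hypothetical, assay-dependent value).
    """
    if enrichment_per_round <= 1:
        raise ValueError("enrichment_per_round must be greater than 1")
    if not 0 < target_fraction < 1:
        raise ValueError("target_fraction must be between 0 and 1")
    ratio = spiked_ifu / background_ifu
    target_ratio = target_fraction / (1 - target_fraction)
    rounds = 0
    while ratio < target_ratio:
        ratio *= enrichment_per_round
        rounds += 1
    return rounds


# Seeding levels from the text: 1, 2, 5, or 10 ifu into 1e6 ifu of wild-type ALV.
for spiked in (1, 2, 5, 10):
    print(spiked, "ifu ->", rounds_needed(spiked, 1e6, enrichment_per_round=50), "rounds")
```

The empirically determined number of rounds from the seeding experiment would replace the invented enrichment factor when planning selections against new targets.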

Theoretically, every possible 6-residue polypeptide should be represented in the randomized X₁₀ ALV polypeptide display library when the diversity of the library approaches 10⁷. Thus, the library should contain the FLAG® epitope, DYKDDDDK (SEQ ID NO:7), or at least six to seven amino acid residues of the FLAG® epitope, which could bind to the anti-FLAG® antibody. To assess the quality of the X₁₀ library and to conduct an additional test of the selection/amplification protocol, the anti-FLAG® monoclonal antibody immobilized on culture plates is used as the target polypeptide for selection of the ALV-X₁₀ library. Multiple rounds of selection/amplification are performed, and the distribution of displayed polypeptides present in the virus pool after each round is characterized as described above. This technique provides a test of the selection/amplification protocol. In addition, if an ALV containing the FLAG® epitope within the randomized region is selected, this indicates that the quality of the polypeptide library approaches or is greater than the theoretical calculations.
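
The rough arithmetic behind this estimate can be laid out as follows. The sketch is a back-of-the-envelope count only: it assumes 20 equally likely residues at each position, counts every overlapping 6-residue window in each 10-mer, and ignores codon bias and duplicate clones. The function names are illustrative.

```python
def sequence_space(residue_count, length):
    """Number of distinct peptides of the given length."""
    return residue_count ** length


def windows_per_peptide(peptide_length, window_length):
    """Overlapping windows of window_length contained in one peptide."""
    return max(peptide_length - window_length + 1, 0)


hexamers = sequence_space(20, 6)                # distinct 6-residue sequences
windows = windows_per_peptide(10, 6)            # 6-residue windows per X10 clone
total_windows = 10**7 * windows                 # windows in a library of 1e7 clones

print("possible 6-mers:", hexamers)
print("windows per 10-mer:", windows)
print("windows in library:", total_windows)
print("windows per possible 6-mer:", total_windows / hexamers)
```

A library of 10⁷ X₁₀ clones therefore presents about 5×10⁷ six-residue windows against 6.4×10⁷ possible 6-mers, so the two quantities are of the same order of magnitude as the diversity approaches 10⁷.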

Example 4 Identifying Amino Acid Sequences that Interact with Human Cancer Cell Targets

The ALV polypeptide display technology described herein is useful to study any cancer related polypeptide or cell. In this example, human breast cancer is studied. ALV polypeptide display libraries are used to identify novel binding ligands associated with human breast cancer in two different in vitro selection formats: (1) purified polypeptide immobilized on a solid support and (2) cells grown in culture.

To obtain polypeptides that specifically bind purified MUC1 extracellular domain, a MUC1-GST fusion protein, consisting of five MUC1 extracellular tandem repeats (20 amino acid residues each) fused to the GST epitope for purification, is immobilized on culture dishes. The three ALV polypeptide display libraries can be used. The tandem repeat region of MUC1 has only one known interaction domain, ICAM-1. It is known that MUC1 is overexpressed and aberrantly glycosylated in most breast carcinomas. The differences in glycosylation possibly provide unique epitopes on normal and aberrant MUC1 that could be identified with the polypeptide libraries. These experiments are designed to identify other polypeptide interaction domains and possibly identify polypeptide candidates by searching amino acid databases with the obtained binding polypeptide sequences. In this example, the selection/amplification protocol described in Example 3 is used. The polypeptide distribution in the virus pool is determined after each round of selection. Putative specific polypeptides that bind MUC1 are engineered back into the ALV(A) molecular clone (inserted between the SfiI and NotI sites), and the binding specificity and affinity of the individual viruses to MUC1 are determined. Also, if appropriate, glycosylation sites are mutated to determine the relative contribution of glycosylation to binding affinity.

To obtain polypeptides that specifically bind breast carcinoma cells expressing high levels of aberrant MUC1, a human breast carcinoma cell line that expresses high levels of MUC1 (e.g., MCF-7 or T47D) and a cell line that expresses low levels of MUC1 or is negative for MUC1 (e.g., MDA-MB-231 or MDA-MB-435) are used to select polypeptides that can differentiate between the two cell types. The three ALV polypeptide display libraries can be used. The polypeptide distribution in the virus pool is determined after each round of selection. After characterizing the putative specific polypeptides, some of the selected polypeptides that specifically bind MUC1 are compared to polypeptides selected using the purified MUC1 polypeptide for differences in binding purified MUC1 and aberrant MUC1 on the carcinoma cell surface.

OTHER EMBODIMENTS

It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

1. A polypeptide comprising the sequence set forth in SEQ ID NO:2 and a first amino acid sequence, wherein said first amino acid sequence is heterologous to naturally occurring avian leukosis virus amino acid sequences, and wherein said first amino acid sequence is attached to the amino-terminal portion of said sequence set forth in SEQ ID NO:2.
2. The polypeptide of claim 1, wherein said first amino acid sequence is between five and 500 amino acid residues in length.
3. The polypeptide of claim 1, wherein said first amino acid sequence is between ten and 250 amino acid residues in length.
4. The polypeptide of claim 1, wherein said first amino acid sequence comprises a sequence from a polypeptide selected from the group consisting of receptors, receptor ligands, immunoglobulins, enzymes, and enzyme substrates.
5. The polypeptide of claim 1, wherein said polypeptide forms a covalent attachment with an avian leukosis virus transmembrane glycoprotein when said polypeptide is part of an avian leukosis virus.
6. A plurality of polypeptides, wherein each polypeptide comprises the sequence set forth in SEQ ID NO:2 and a first amino acid sequence, wherein said first amino acid sequence of each polypeptide is heterologous to naturally occurring avian leukosis virus amino acid sequences, and wherein said first amino acid sequence of each polypeptide is attached to the amino-terminal portion of said sequence set forth in SEQ ID NO:2.
7. The plurality of polypeptides of claim 6, wherein said first amino acid sequence of each polypeptide is different.
8. The plurality of polypeptides of claim 6, wherein each polypeptide forms a covalent attachment with an avian leukosis virus transmembrane glycoprotein when part of an avian leukosis virus.
9. A virus comprising a first polypeptide, wherein said first polypeptide comprises the sequence set forth in SEQ ID NO:2 and a first amino acid sequence, wherein said first amino acid sequence is heterologous to naturally occurring avian leukosis virus amino acid sequences, and wherein said first amino acid sequence is attached to the amino-terminal portion of said sequence set forth in SEQ ID NO:2.
10. The virus of claim 9, wherein said virus is a retrovirus.
11. The virus of claim 9, wherein said virus is an avian leukosis virus or a murine leukemia virus.
12. The virus of claim 9, wherein said first polypeptide forms a covalent attachment with an avian leukosis virus transmembrane glycoprotein when said first polypeptide is part of an avian leukosis virus.
13. The virus of claim 9, wherein said virus comprises an avian leukosis virus transmembrane glycoprotein.
14. The virus of claim 13, wherein said first polypeptide forms a covalent attachment with said avian leukosis virus transmembrane glycoprotein.
15. The virus of claim 9, wherein said virus comprises a nucleic acid molecule comprising a first nucleic acid sequence, wherein said first nucleic acid sequence encodes said first polypeptide.
16. The virus of claim 15, wherein said nucleic acid molecule comprises a second nucleic acid sequence, wherein said second nucleic acid sequence is heterologous to naturally occurring avian leukosis viruses.
17. The virus of claim 16, wherein said second nucleic acid sequence encodes a second polypeptide.
18. The virus of claim 17, wherein said second polypeptide is selected from the group consisting of receptors, receptor ligands, immunoglobulins, enzymes, and enzyme substrates.
19. The virus of claim 16, wherein said second nucleic acid sequence is located between said first nucleic acid sequence and a 3′ LTR viral sequence.
20. The virus of claim 9, wherein said virus is replication-competent.
21. The virus of claim 9, wherein said virus is replication-defective.
22. A plurality of viruses, wherein each virus comprises a first polypeptide, wherein each first polypeptide comprises the sequence set forth in SEQ ID NO:2 and a first amino acid sequence, wherein said first amino acid sequence is heterologous to naturally occurring avian leukosis virus amino acid sequences, and wherein said first amino acid sequence is attached to the amino-terminal portion of said sequence set forth in SEQ ID NO:2.
23. The plurality of viruses of claim 22, wherein said first amino acid sequence of each first polypeptide is different.
24. The plurality of viruses of claim 22, wherein each virus comprises a nucleic acid molecule comprising a first nucleic acid sequence, wherein said first nucleic acid sequence encodes said first polypeptide.
25. The plurality of viruses of claim 24, wherein said nucleic acid molecule of each virus comprises a second nucleic acid sequence.
26. The plurality of viruses of claim 25, wherein said second nucleic acid sequence of each virus is different.
27. The plurality of viruses of claim 25, wherein said second nucleic acid sequence encodes a second polypeptide.
28. The plurality of viruses of claim 27, wherein each virus comprises said second polypeptide.
29. The plurality of viruses of claim 22, wherein each virus is replication-competent.
30. The plurality of viruses of claim 22, wherein each virus is replication-defective.
31. The plurality of viruses of claim 22, wherein said plurality is at least 500.